An Intelligent Search Infrastructure for Language Resources on the Web
Authors
Abstract
Language occupies a central role on the web: most content is expressed in a given language, and most access takes place via natural language input and interfaces. Today, investigation of human language in all its forms depends on access to this vast store of language data. In particular, linguists and language technologists annotate and analyze this data and develop new language resources including grammars, dictionaries, and a raft of new technologies for automatic translation, information extraction, question answering, and so forth. As this new documentation is disseminated on the web, and as the new technologies are in turn deployed on the web, a further round of collection and processing is enabled, closing the loop. For instance, a collection of Japanese text with an aligned English translation can be used for translation studies, for adding examples to bilingual dictionaries, and for developing translation systems. These resources can then be used for new purposes, e.g., to give English speakers access to content stored in Japanese text, or to provide Japanese learners of English with more authentic example sentences.
Similar resources
A Large-Scale Web Data Collection as a Natural Language Processing Infrastructure
In recent years, language resources acquired from the Web have been released, and these data have improved the performance of applications in several NLP tasks. Although language resources based on the web page unit are useful in NLP tasks and applications such as knowledge acquisition, document retrieval and document summarization, such language resources have not been released so far. In this paper, we prop...
The state-of-the-art in web-scale semantic information processing for cloud computing
Based on an integrated infrastructure for resource sharing and computing in a distributed environment, cloud computing involves the provision of dynamically scalable, virtualized resources as services over the Internet. These applications also bring large-scale heterogeneous and distributed information, which poses a great challenge in terms of semantic ambiguity. It is critical for a...
Intelligent Health Solution System
Introduction: In management, the statistics and performance of an organization's deputies and functions are always of great importance. This requires instant access to the latest status of the system under coverage, and at least a minimal forecast of the future situation, in order to provide and improve quality services. All of this justifies the existence of an intelligent statistical system w...
Expert Knowledge Management based on Ontology in a Digital Library
The architecture of future Digital Libraries should allow any user to access available knowledge resources from anywhere, at any time, and in an efficient manner. Moreover, for the individual user, there is a great deal of useless information in addition to the substantial amount of useful information. The goal is to investigate how to best combine Artificial Intelligence and Semantic ...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...